Binpairs: Utilization of Illumina Paired-End Information for Improving Efficiency of Taxonomic Binning of Metagenomic Sequences
Authors
Abstract
MOTIVATION: Paired-end sequencing protocols, offered by next-generation sequencing (NGS) platforms like Illumina, generate a pair of reads for every DNA fragment in a sample. Although this protocol has been utilized in several metagenomics studies, most taxonomic binning approaches classify the two reads of a pair independently. The present work explores some simple but effective strategies for utilizing the pairing information of Illumina short reads to improve the accuracy of taxonomic binning of metagenomic datasets. The proposed strategies can be used in conjunction with all genres of existing binning methods.
RESULTS: Validation results suggest that employing these "Binpairs" strategies can provide significant improvements in the binning outcome. The quality of the taxonomic assignments thus obtained is often comparable to that achievable only with the relatively longer reads obtained using other NGS platforms (such as Roche).
AVAILABILITY: An implementation of the proposed strategies for utilizing pairing information is freely available for academic users at https://metagenomics.atc.tcs.com/binning/binpairs.
Similar resources
Classification of metagenomic sequences: methods and challenges
Characterizing the taxonomic diversity of microbial communities is one of the primary objectives of metagenomic studies. Taxonomic analysis of microbial communities, a process referred to as binning, is challenging for the following reasons. Primarily, query sequences originating from the genomes of most microbes in an environmental sample lack taxonomically related sequences in existing refere...
COCACOLA: binning metagenomic contigs using sequence COmposition, read CoverAge, CO-alignment and paired-end read LinkAge
Motivation The advent of next-generation sequencing technologies enables researchers to sequence complex microbial communities directly from the environment. Because assembly typically produces only genome fragments, also known as contigs, instead of an entire genome, it is crucial to group them into operational taxonomic units (OTUs) for further taxonomic profiling and down-streaming functiona...
Fast and Accurate Taxonomic Assignments of Metagenomic Sequences Using MetaBin
Taxonomic assignment of sequence reads is a challenging task in metagenomic data analysis, for which the present methods mainly use either composition- or homology-based approaches. Though the homology-based methods are more sensitive and accurate, they suffer primarily due to the time needed to generate the Blast alignments. We developed the MetaBin program and web server for better homology-b...
BusyBee Web: metagenomic data analysis by bootstrapped supervised binning and annotation
Metagenomics-based studies of mixed microbial communities are impacting biotechnology, life sciences and medicine. Computational binning of metagenomic data is a powerful approach for the culture-independent recovery of population-resolved genomic sequences, i.e. from individual or closely related, constituent microorganisms. Existing binning solutions often require a priori characterized refer...
Modified Bootstrapping and K-Means Clustering for Taxonomic Binning
Metagenomics is the study of microbial ecology using genetics as an access point. We seek to understand the microbial communities in environments such as tidal pools, soil, mine runoff, or even the human gut, so that we can understand the impact that microbes have on our world and our health. Metagenomic analysis usually involves the determination of what species are present in a given sample, ...
Journal title:
Volume 9, Issue
Pages -
Publication date: 2014